Reason Why

“At the beginning of FY2020, the Marketing, Comms, and Sales teams were challenged by a new business objective: High Velocity Merchants (HVM). After initial market research, we realized the potential of this new target and the difficulty of identifying these companies with the tools available at the time. As the information required (company name, investment stage, founders, events, associations, etc.) was mostly openly accessible, we decided to use data analytics in our favor to create a project that could benefit marketing’s perspective, lead generation, content topics, advertising strategies, brand awareness and event mapping.”

Estefanía Granados

“This mapping exercise not only allows us to better understand the customer profiles we are looking for; it also gives us a concrete guideline for establishing an effective communication and resource investment strategy to attract them.”

Ángela Bohorquez


Objectives

High Velocity Merchants (HVM) were initially conceived as companies valued at over 1 billion dollars that had not executed a merger or Initial Public Offering (IPO). They were usually known for having passed through multiple funding rounds and being at a late investment stage.

As the concept was relatively new and there was ambiguity about what it means to be “at a late investment stage”, we decided to map available companies at all investment stages.

This project is an effort to identify HVMs’ behavioural and topic patterns within the global entrepreneurship ecosystem at an advanced investment level, through their points of interaction.


Objectives

  • Identify who the HVMs are, where they are, and which events they attend.
  • Identify the conversational topics they reveal on social media.

To pursue these objectives, Fidelio carried out digital research based on websites dedicated to gathering information on companies and events. Angel.co, Crunchbase, 10times and Twitter are the main data sources.

These data sources were used to consolidate a worksheet database with information about:

  • Companies
  • Founders
  • Venture Capitalists
  • Events
  • Institutions

Twitter was used as the main data source for social media. Its content served to model the latent topics in all of the HVMs’ conversations available online.


Methodology

  1. Business understanding (Jun 2019)
  2. Data understanding (Jun 2019)
  3. Data preparation (Jul 2019)
  4. Data processing (Aug 2019)
  5. Insights presentation (Aug 2019)

This methodology is inspired by the Cross Industry Standard Process for Data Mining (CRISP-DM).


Technical details

  • Name of the project: HVM Ecosystem Mapping.
  • Data collection date: 15 Jun 2019 - 17 Aug 2019.
  • Company responsible for the study: FIDELIO DIGITAL S A S.
  • Company sponsoring the study: PAYU COLOMBIA S.A.S.
  • Target group:
    • Companies in multiple investment stages.
    • Founders of those companies.
    • Venture capitalists investing in those companies.
    • Events worldwide.
    • Institutions organizing those events.
  • Sample design: targeted unweighted sampling.
  • Sampling frame: angel.co, crunchbase.com, 10times.com, twitter.com.
  • Sample size:
    • 4,205 companies.
    • 5,540 founders.
    • 487 venture capitalists.
    • 8,080 events.
    • 2,155 institutions.
  • Data collection technique: web scraping.
  • Geographical scope: worldwide.
  • Error margin: does not apply.
  • Delivery report date: Aug 19, 2019.


Technologies used

This work has been developed thanks to open source technologies:

  • Python. Packages: pandas, numpy, matplotlib, wordcloud, nltk, sklearn, collections, gensim, bokeh.
  • R. Packages: knitr, readxl, data.table, plotly, RColorBrewer, DT, bubbles.
  • JavaScript
  • HTML
  • CSS

Glossary

  • Word cloud: an electronic image that shows words used in a particular piece of electronic text or series of texts. The words are different sizes according to how often they are used in the text.


Mapping Tables


Use the navigation bar on your left to step across the tables or select one of the links below:

  1. Companies
  2. Founders
  3. Venture capitalists
  4. Events & Institutions

Each word cloud shows the root (stem) of every word. We grouped words by root to keep related forms together. For example, “developer”, “development” and “develop” all map to the same root word: “develop”.
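The grouping by root can be sketched in a few lines of Python. This is a deliberately simplified suffix-stripping function, not the stemmer used for the report (nltk is listed among the packages, but the report does not say which stemmer was applied). The function name `toy_stem` and the suffix list are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical, simplified suffix list; a real stemmer has many more rules.
SUFFIXES = ("ment", "ers", "er", "ing", "es", "s")


def toy_stem(word: str) -> str:
    """Reduce a word to an approximate root by stripping one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least four characters of the root remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so that "provide" and "provider" share the root "provid".
    if word.endswith("e") and len(word) > 4:
        word = word[:-1]
    return word


words = ["developer", "development", "develop", "provider", "provide", "platform"]
root_counts = Counter(toy_stem(w) for w in words)
print(root_counts.most_common())
```

Counting roots rather than raw words is what lets a word cloud size “develop” by the combined frequency of all its variants.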


Companies



Among the main words identified in the companies’ descriptions are “platform”, “provid” (meaning provider), “onlin”, “develop” and “servic”. These words imply that the companies are mostly related to online platforms that provide services and solutions using technology and data, i.e. they are mainly digital companies.

Besides the columns shown in the table, we gathered 94 columns for each company. Among the most important are: Categories, Funding Status, IPO Status, Facebook, LinkedIn, Twitter, Contact Email, Phone Number, Description, Total Funding Amount, Number of Funding Rounds, Number of Investors, Number of Lead Investors, Number of Current Team Members, Number of Articles and Number of Events, among many others.


Verticals

To give better insights, all categories in this report were merged with PayU’s main verticals.

Additional categories were added in order to preserve consistency among companies and between industries.

PayU’s main verticals represent 69% of the categories in the database. Most of them are Digital Services (28.2%), Direct Selling (17.5%) and Software (16.6%).

The categories with the highest average revenue are Aerospace ($653M USD), Digital Services ($471M USD) and Fintech ($169M USD). Since Aerospace does not have a significant number of companies (10), the category with the highest average sales is Digital Services.

In terms of monthly visitors, Software is significantly higher than the other categories, with an average of 67 million monthly visits. The top companies are YouTube (24 billion), Quora (589 million) and Zhihu (286 million).

Government and NGOs have the highest number of tech products. However, the number of companies working in those categories is not representative (11 and 4). Setting those aside, the category with the most tech products is Uber Model Sharing (31).


Location - Country

84.3% of the companies mapped are located in the United States (75.2%), United Kingdom (4%), China (3%) and India (2.1%).

For each country, we mapped the average annual revenue in USD, monthly visitors, technological products, team members, events, articles (written by the companies), number of funding rounds and number of investors. Countries with relatively few companies (fewer than 9) are excluded from this specific analysis.
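The exclusion rule for countries with few companies can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `average_by_country`, the constant `MIN_COMPANIES` and the sample rows are assumptions, and the numbers are made up rather than taken from the study.

```python
from collections import defaultdict
from statistics import mean

MIN_COMPANIES = 9  # countries with fewer companies than this are skipped


def average_by_country(rows, min_companies=MIN_COMPANIES):
    """Return {country: mean revenue} for countries with enough companies.

    `rows` is an iterable of (country, revenue_usd) pairs.
    """
    revenues = defaultdict(list)
    for country, revenue in rows:
        revenues[country].append(revenue)
    return {
        country: mean(values)
        for country, values in revenues.items()
        if len(values) >= min_companies
    }


# Made-up illustrative rows, not figures from the study.
rows = [("A", 10.0)] * 9 + [("B", 500.0)] * 2
print(average_by_country(rows))
```

Country "B" has a much higher average but only two companies, so it is left out. This is the same reasoning that keeps a handful of outliers from topping the rankings.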

Finland ($266M USD), India ($258M USD) and the United States ($245M USD) lead in average revenue per company.

In terms of average monthly visitors, Indonesia takes the lead with 48M visits. Indonesia has relatively few companies (9), so it makes sense to zoom in on the most visited one: Bukalapak, an e-commerce company with 82M monthly visits.

Using technological products is vital for these companies to guarantee a competitive advantage. Indonesia uses the most on average, followed by India, which has a relatively high number of companies (81). India is known worldwide for its computing capabilities, and proof of that is the number of technological products each company uses on average: 28.

Belgium and Israel lead in average team members (10 and 9, respectively), and Switzerland and the US in number of Board Members (6 and 4).

On average, the countries whose companies attend the most events are the United States (6), Sweden (6), France (6) and Belgium (6). This insight can serve as an input to tune the worldwide event participation strategy towards these countries.

Finally, the countries with the most investors, on average, are Sweden (14), Singapore (11) and the United States (9).

Location - PayU Regions

India leads in average annual revenue with $258.7M USD, followed by USA and Canada with $238.8M USD and Asia with $98.4M USD. Not surprisingly, USA/CA and India lead all regions in average monthly visitors (18.6M and 9.7M, respectively).

India and Asia lead in average funding rounds (6 and 4.4), while India, Brazil and USA/CA lead in average investors per company (9).


Funding

54.8% of companies are part of PayU’s main focus, as they are passing through funding stages (Early and Late Stage Venture). Merged & Acquired (M&A) companies make up 36.1% of the database.

Public companies (IPO), those listed on a stock exchange, represent 1.74% of the data.

We decided to include all companies in the analysis, since the definition of “High Velocity Merchants” is still being tested.

Companies with Private Equity are those that have not passed through an investment stage, so they are financed by their own equity and debt. They range from small and medium businesses to big private companies. The average number of investors in this type of company is 7. Even though they represent 1.26% of the companies in the database, they account for 34% (736M USD) of the funding amount mapped.


Team

53% of companies in the database have between 11 and 200 employees and 14% have between 201 and 5,000 employees. Very few (44) have more than 5,000 employees, and most of those are in the United States.

The difference between Employees and Team Members is that the latter refers to the leadership team (VPs, managers, C-level employees).

Companies with between 501 and 1,000 employees attend the most events (13 per company, on average), while companies with between 1,001 and 5,000 employees publish the most articles, on average.


Contact Information

This bar chart shows how complete the contact information in the database is, since not all of the data is public or centralized.
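The completeness behind such a chart could be computed along these lines. This is a sketch under stated assumptions: the function name `fill_rate` is hypothetical, the field names follow the columns listed earlier in the report, and the records are made-up placeholders rather than data from the study.

```python
def fill_rate(records, fields):
    """Return the share of records with a non-empty value for each field."""
    total = len(records)
    rates = {}
    for field in fields:
        # A value counts as filled when it is neither missing nor an empty string.
        filled = sum(1 for record in records if record.get(field) not in (None, ""))
        rates[field] = filled / total if total else 0.0
    return rates


# Made-up placeholder records, not data from the study.
records = [
    {"Contact Email": "info@example.com", "Phone Number": ""},
    {"Contact Email": None, "Phone Number": "555-0100"},
    {"Contact Email": "hello@example.org", "Phone Number": None},
    {"Contact Email": "team@example.net", "Phone Number": ""},
]
print(fill_rate(records, ["Contact Email", "Phone Number"]))
```

Each resulting share corresponds to the height of one bar: the fraction of companies for which that contact field is available.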


Founders



Founders describe themselves by their job role (such as “CEO”, “CTO”, “entrepreneur” or “investor”), their work approach (“work”, “design”, “technolog”, “develop”) and their academic background (“studi”, “stanford university”, “University California”).


Top Influential Founders

Of the 5,540 founders in this study, Angel.co shows only 5 who appear in more than one High Velocity Merchant (Oleg Rogynskyy, Brian Wong, Arjun Dev Arora, David Gutelius, David Cancel). They are shown in colors other than green. Circle size increases with the number of connections on Angel.co.
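Finding founders linked to more than one company amounts to grouping founder-company pairs, which might look like this. The function name `founders_in_multiple_companies` and all names in the sample are placeholders chosen for illustration, not records from the study.

```python
from collections import defaultdict


def founders_in_multiple_companies(pairs):
    """Return {founder: sorted companies} for founders linked to more than one company.

    `pairs` is an iterable of (founder, company) tuples.
    """
    companies_by_founder = defaultdict(set)
    for founder, company in pairs:
        companies_by_founder[founder].add(company)
    return {
        founder: sorted(companies)
        for founder, companies in companies_by_founder.items()
        if len(companies) > 1
    }


# Placeholder names only; these are not records from the study.
pairs = [
    ("Founder A", "Company 1"),
    ("Founder A", "Company 2"),
    ("Founder B", "Company 1"),
    ("Founder C", "Company 3"),
    ("Founder C", "Company 3"),  # duplicate rows do not count twice
]
print(founders_in_multiple_companies(pairs))
```

Using a set per founder means repeated rows for the same company are ignored, so only genuinely distinct companies are counted.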


Contact Information

Founders’ personal contact details, such as email or mobile number, are usually hidden on their social media profiles. LinkedIn is the channel with the most structured information and the most business-oriented use, so reaching out through Sales Navigator (LinkedIn) is highly recommended.


Venture Capitalists



Venture capitalists describe themselves in terms of investment (as venture capitalists) and of the kind of enterprise they are looking for (startup, early stage). They do not disclose their academic or work background, nor the origin of their funds.


Location

The USA, India, China, Russia and the UK account for 60% of Venture Capitalists. Most of them are in the USA (33%).


Employees

76% of venture capital firms in the database have between 1 and 10 employees. Very few (7) have more than 5,000 employees, and most of those are in the United States.


Events & Institutions

Events happening in the coming months:


Locations

The USA, Canada, China and India host 3,809 events (48% of all events mapped globally).


Dates

Because all events were gathered during July and August 2019, most of them fall around those dates. For the following year, April has the peak with 167 events.


Verticals

The industries with the most events are HR, Jobs & Career, Antiques & Philately, Veterinary, Aerospace & Telecommunication. The events with the most visitors are Gifts & Gifting, Fashion & Beauty, Architecture & Designing, IT & Technology and Business Services.


Visitors by Verticals & Regions

In all regions, Direct Selling is the category with the most visitors. Travel is the second most visited category in Asia and India, and Agtech & Food is the second most visited category in EMEA and SSC.


Institutions

2,155 of institutions are in the USA, Canada, India and China.


Companies’ content



Most of the content generated by companies is related to development, software, technology, money, businesses, sites, engagement, and part of it related to women in tech as well.


Verticals’ social status


Content Clustering

Plot

For each company, we took the content of all its tweets and mapped it onto a cluster scatter plot. Each cluster is built from word interactions and content.


Topic Modelling

For each topic revealed, the words most relevant to that topic are highlighted. Use the control on the top left to navigate between topics and the control on the right to adjust how relevant each word is to each topic. The relevance metric is set to 1 so that all words are relevant to all topics.
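How a relevance setting changes the word ranking can be illustrated with the definition used by LDAvis-style topic browsers: relevance = λ · log p(w|t) + (1 − λ) · log(p(w|t) / p(w)). Whether the visualisation in this report uses exactly this formula is an assumption, since the tool is not named, and the probabilities below are made up for illustration.

```python
import math


def relevance(p_word_given_topic, p_word, lam):
    """LDAvis-style relevance: lam*log p(w|t) + (1-lam)*log(p(w|t)/p(w))."""
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(
        p_word_given_topic / p_word
    )


def rank_words(topic_probs, corpus_probs, lam):
    """Return the topic's words sorted from most to least relevant."""
    return sorted(
        topic_probs,
        key=lambda w: relevance(topic_probs[w], corpus_probs[w], lam),
        reverse=True,
    )


# Made-up probabilities for one topic: "help" is frequent everywhere,
# "recruit" is rarer overall but concentrated in this topic.
topic_probs = {"help": 0.05, "recruit": 0.03}
corpus_probs = {"help": 0.04, "recruit": 0.002}

print(rank_words(topic_probs, corpus_probs, lam=1.0))
print(rank_words(topic_probs, corpus_probs, lam=0.0))
```

Under this definition, a value of 1 ranks words purely by their probability within the topic, while lower values favour words that are distinctive to the topic.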


Main topics analysis

  • Topic 1: Related to day-to-day work (year, day, time) and terms such as “help”, “great” and “learn”. It might reflect their intention to help their customers using great products.
  • Topic 2: Related to market performance in their industry. Main words: “market”, “busi”, “mobil”, “brand”, “digit”.
  • Topic 3: Related to E-Commerce, Artificial Intelligence (AI), advertising (email), data and retail. It might be highlighting the impact of automated systems on the retail industry.

These first 3 topics are related to digital businesses, E-Commerce, Artificial Intelligence (AI) and their impact across all industries. Companies are talking from their own perspective about how their services can be used to solve problems.

  • Topic 4: The word “women” stands out in this topic. Its closeness to the first two topics implies that companies are also talking about women in tech nowadays (woman, team, today, time).
  • Topic 5: This topic is far from the others and relates to customer service and how they go about solving problems. Main words: “pleas(e)”, “help”, “team”, “sorr(y)”, “support”, “contact”.
  • Topic 6: While the previous topic was about solving problems, this one relates to the feelings associated with the problems solved: “love”, “design”, “live”, “style”, “happ(y)”, “beaut(y)”.

In a way, as they are “shouting” how they want to solve their customers’ problems, this can serve as an input on how PayU should communicate with them and which terms to use in that contact.

Most of the other topics are close together, so depending on PayU’s strategy, one topic might work better than another.

  • Topic 20: Relatively far from the others, topic 20 shows signs of human resources and an interest in hiring new talent in the digital industry: “mobil(e)”, “app”, “network”, “hire”, “recruit”, “manag(ement)” and “hrtech_hr”.


Next Steps

Mapping the HVM ecosystem provides input for defining strategies within PayU’s core business:

  1. Serves as a lead database to boost commercial activity (outbound).
  2. Provides a lead database for marketing and communication campaigns (inbound).
  3. Sets the parameters for event participation and for profiling PayU’s speakers in each vertical.
  4. Highlights the main conversation topics among HVMs so that communication can be fluid.
  5. Identifies influencers in the ecosystem so they can become allies in PayU’s strategies.